AITopics | grasp type

Collaborating Authors

grasp type

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

HOH: Markerless Multimodal Human-Object-Human Handover Dataset with Large Object Count

Neural Information Processing SystemsApr-29-2026, 23:07:01 GMT

We present the HOH (Human-Object-Human) Handover Dataset, a large object count dataset with 136 objects, to accelerate data-driven research on handover studies, human-robot handover implementation, and artificial intelligence (AI) on handover parameter estimation from 2D and 3D data of two-person interactions. HOH contains multi-view RGB and depth data, skeletons, fused point clouds, grasp type and handedness labels, object, giver hand, and receiver hand 2D and 3D segmentations, giver and receiver comfort ratings, and paired object metadata and aligned 3D models for 2,720 handover interactions spanning 136 objects and 20 giver-receiver pairs--40 with role-reversal--organized from 40 participants. We also show experimental results of neural networks trained using HOH to perform grasp, orientation, and trajectory prediction. As the only fully markerless handover capture dataset, HOH represents natural human-human handover interactions, overcoming challenges with markered datasets that require specific suiting for body tracking, and lack high-resolution hand tracking. To date, HOH is the largest handover dataset in terms of object count, participant count, pairs with role reversal accounted for, and total interactions captured.

artificial intelligence, interaction, machine learning, (17 more...)

Neural Information Processing Systems

Country:

North America > United States (0.68)
Europe (0.46)

Industry: Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.48)
Information Technology > Artificial Intelligence > Robots > Humanoid Robots (0.35)

Add feedback

HOH: Markerless Multimodal Human-Object-Human Handover Dataset with Large Object Count

Neural Information Processing SystemsFeb-17-2026, 10:08:37 GMT

The large-scale propagation of human-human handover research has been hindered by two challenges.

artificial intelligence, interaction, machine learning, (18 more...)

Neural Information Processing Systems

Country:

North America > United States (0.46)
Europe > Italy > Lazio > Rome (0.04)
Europe > Germany > Berlin (0.04)

Industry: Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

OmniDexVLG: Learning Dexterous Grasp Generation from Vision Language Model-Guided Grasp Semantics, Taxonomy and Functional Affordance

Zhang, Lei, Zheng, Diwen, Bai, Kaixin, Bing, Zhenshan, Marton, Zoltan-Csaba, Chen, Zhaopeng, Knoll, Alois Christian, Zhang, Jianwei

arXiv.org Artificial IntelligenceDec-4-2025

Dexterous grasp generation aims to produce grasp poses that align with task requirements and human interpretable grasp semantics. However, achieving semantically controllable dexterous grasp synthesis remains highly challenging due to the lack of unified modeling of multiple semantic dimensions, including grasp taxonomy, contact semantics, and functional affordance. To address these limitations, we present OmniDexVLG, a multimodal, semantics aware grasp generation framework capable of producing structurally diverse and semantically coherent dexterous grasps under joint language and visual guidance. Our approach begins with OmniDexDataGen, a semantic rich dexterous grasp dataset generation pipeline that integrates grasp taxonomy guided configuration sampling, functional affordance contact point sampling, taxonomy aware differential force closure grasp sampling, and physics based optimization and validation, enabling systematic coverage of diverse grasp types. We further introduce OmniDexReasoner, a multimodal grasp type semantic reasoning module that leverages multi agent collaboration, retrieval augmented generation, and chain of thought reasoning to infer grasp related semantics and generate high quality annotations that align language instructions with task specific grasp intent. Building upon these components, we develop a unified Vision Language Grasping generation model that explicitly incorporates grasp taxonomy, contact structure, and functional affordance semantics, enabling fine grained control over grasp synthesis from natural language instructions. Extensive experiments in simulation and real world object grasping and ablation studies demonstrate that our method substantially outperforms state of the art approaches in terms of grasp diversity, contact semantic diversity, functional affordance diversity, and semantic consistency.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2512.03874

Genre:

Research Report > New Finding (0.68)
Research Report > Promising Solution (0.48)

Technology:

Information Technology > Artificial Intelligence > Robots > Manipulation (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
(2 more...)

Add feedback

ZeroDexGrasp: Zero-Shot Task-Oriented Dexterous Grasp Synthesis with Prompt-Based Multi-Stage Semantic Reasoning

Jian, Juntao, Wei, Yi-Lin, Mou, Chengjie, Lin, Yuhao, Zhu, Xing, Shen, Yujun, Zheng, Wei-Shi, Hu, Ruizhen

arXiv.org Artificial IntelligenceNov-18-2025

Task-oriented dexterous grasping holds broad application prospects in robotic manipulation and human-object interaction. However, most existing methods still struggle to generalize across diverse objects and task instructions, as they heavily rely on costly labeled data to ensure task-specific semantic alignment. In this study, we propose \textbf{ZeroDexGrasp}, a zero-shot task-oriented dexterous grasp synthesis framework integrating Multimodal Large Language Models with grasp refinement to generate human-like grasp poses that are well aligned with specific task objectives and object affordances. Specifically, ZeroDexGrasp employs prompt-based multi-stage semantic reasoning to infer initial grasp configurations and object contact information from task and object semantics, then exploits contact-guided grasp optimization to refine these poses for physical feasibility and task alignment. Experimental results demonstrate that ZeroDexGrasp enables high-quality zero-shot dexterous grasping on diverse unseen object categories and complex task requirements, advancing toward more generalizable and intelligent robotic grasping.

artificial intelligence, large language model, natural language, (15 more...)

arXiv.org Artificial Intelligence

2511.13327

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Add feedback

BiDexHand: Design and Evaluation of an Open-Source 16-DoF Biomimetic Dexterous Hand

Weng, Zhengyang Kris

arXiv.org Artificial IntelligenceOct-7-2025

Achieving human-level dexterity in robotic hands remains a fundamental challenge for enabling versatile manipulation across diverse applications. This extended abstract presents BiDexHand, a cable-driven biomimetic robotic hand that combines human-like dexterity with accessible and efficient mechanical design. The robotic hand features 16 independently actuated degrees of freedom and 5 mechanically coupled joints through novel phalange designs that replicate natural finger motion. Performance validation demonstrated success across all 33 grasp types in the GRASP Taxonomy, 9 of 11 positions in the Kapandji thumb opposition test, a measured fingertip force of 2.14\,N, and the capability to lift a 10\,lb weight. As an open-source platform supporting multiple control modes including vision-based teleoperation, BiDexHand aims to democratize access to advanced manipulation capabilities for the broader robotics research community.

artificial intelligence, bidexhand, robotic hand, (14 more...)

arXiv.org Artificial Intelligence

2504.14712

Country: North America > United States (0.14)

Genre: Research Report (0.50)

Industry: Health & Medicine (0.47)

Technology: Information Technology > Artificial Intelligence > Robots > Manipulation (1.00)

Add feedback

Lang2Morph: Language-Driven Morphological Design of Robotic Hands

Qiao, Yanyuan, Gilday, Kieran, Xie, Yutong, Hughes, Josie

arXiv.org Artificial IntelligenceSep-24-2025

Designing robotic hand morphologies for diverse manipulation tasks requires balancing dexterity, manufacturability, and task-specific functionality. While open-source frameworks and parametric tools support reproducible design, they still rely on expert heuristics and manual tuning. Automated methods using optimization are often compute-intensive, simulation-dependent, and rarely target dexterous hands. Large language models (LLMs), with their broad knowledge of human-object interactions and strong generative capabilities, offer a promising alternative for zero-shot design reasoning. In this paper, we present Lang2Morph, a language-driven pipeline for robotic hand design. It uses LLMs to translate natural-language task descriptions into symbolic structures and OPH-compatible parameters, enabling 3D-printable task-specific morphologies. The pipeline consists of: (i) Morphology Design, which maps tasks into semantic tags, structural grammars, and OPH-compatible parameters; and (ii) Selection and Refinement, which evaluates design candidates based on semantic alignment and size compatibility, and optionally applies LLM-guided refinement when needed. We evaluate Lang2Morph across varied tasks, and results show that our approach can generate diverse, task-relevant morphologies. To our knowledge, this is the first attempt to develop an LLM-based framework for task-conditioned robotic hand design.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2509.18937

Country: Asia > Middle East > UAE (0.28)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Add feedback

HOGraspFlow: Exploring Vision-based Generative Grasp Synthesis with Hand-Object Priors and Taxonomy Awareness

Shi, Yitian, Guo, Zicheng, Wolf, Rosa, Welte, Edgar, Rayyes, Rania

arXiv.org Artificial IntelligenceSep-23-2025

We propose Hand-Object\emph{(HO)GraspFlow}, an affordance-centric approach that retargets a single RGB with hand-object interaction (HOI) into multi-modal executable parallel jaw grasps without explicit geometric priors on target objects. Building on foundation models for hand reconstruction and vision, we synthesize $SE(3)$ grasp poses with denoising flow matching (FM), conditioned on the following three complementary cues: RGB foundation features as visual semantics, HOI contact reconstruction, and taxonomy-aware prior on grasp types. Our approach demonstrates high fidelity in grasp synthesis without explicit HOI contact input or object geometry, while maintaining strong contact and taxonomy recognition. Another controlled comparison shows that \emph{HOGraspFlow} consistently outperforms diffusion-based variants (\emph{HOGraspDiff}), achieving high distributional fidelity and more stable optimization in $SE(3)$. We demonstrate a reliable, object-agnostic grasp synthesis from human demonstrations in real-world experiments, where an average success rate of over $83\%$ is achieved.

artificial intelligence, hograspflow, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2509.16871

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Using Visual Language Models to Control Bionic Hands: Assessment of Object Perception and Grasp Inference

Karaali, Ozan, Farag, Hossam, Dosen, Strahinja, Stefanovic, Cedomir

arXiv.org Artificial IntelligenceSep-18-2025

This study examines the potential of utilizing Vision Language Models (VLMs) to improve the perceptual capabilities of semi-autonomous prosthetic hands. We introduce a unified benchmark for end-to-end perception and grasp inference, evaluating a single VLM to perform tasks that traditionally require complex pipelines with separate modules for object detection, pose estimation, and grasp planning. To establish the feasibility and current limitations of this approach, we benchmark eight contemporary VLMs on their ability to perform a unified task essential for bionic grasping. From a single static image, they should (1) identify common objects and their key properties (name, shape, orientation, and dimensions), and (2) infer appropriate grasp parameters (grasp type, wrist rotation, hand aperture, and number of fingers). A corresponding prompt requesting a structured JSON output was employed with a dataset of 34 snapshots of common objects. Key performance metrics, including accuracy for categorical attributes (e.g., object name, shape) and errors in numerical estimates (e.g., dimensions, hand aperture), along with latency and cost, were analyzed. The results demonstrated that most models exhibited high performance in object identification and shape recognition, while accuracy in estimating dimensions and inferring optimal grasp parameters, particularly hand rotation and aperture, varied more significantly. This work highlights the current capabilities and limitations of VLMs as advanced perceptual modules for semi-autonomous control of bionic limbs, demonstrating their potential for effective prosthetic applications.

large language model, machine learning, orientation, (20 more...)

arXiv.org Artificial Intelligence

2509.13572

Country: Europe > Denmark (0.15)

Genre: Research Report > Experimental Study (0.34)

Industry:

Health & Medicine > Health Care Technology (0.48)
Health & Medicine > Therapeutic Area > Orthopedics/Orthopedic Surgery (0.34)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.33)

Add feedback

OVGrasp: Open-Vocabulary Grasping Assistance via Multimodal Intent Detection

Hu, Chen, Luo, Shan, Gionfrida, Letizia

arXiv.org Artificial IntelligenceSep-5-2025

Grasping assistance is essential for restoring autonomy in individuals with motor impairments, particularly in unstructured environments where object categories and user intentions are diverse and unpredictable. We present OVGrasp, a hierarchical control framework for soft exoskeleton-based grasp assistance that integrates RGB-D vision, open-vocabulary prompts, and voice commands to enable robust multimodal interaction. To enhance generalization in open environments, OVGrasp incorporates a vision-language foundation model with an open-vocabulary mechanism, allowing zero-shot detection of previously unseen objects without retraining. A multimodal decision-maker further fuses spatial and linguistic cues to infer user intent, such as grasp or release, in multi-object scenarios. We deploy the complete framework on a custom egocentric-view wearable exoskeleton and conduct systematic evaluations on 15 objects across three grasp types. Experimental results with ten participants demonstrate that OVGrasp achieves a grasping ability score (GAS) of 87.00%, outperforming state-of-the-art baselines and achieving improved kinematic alignment with natural hand motion.

detection, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2509.04324

Country:

Europe > United Kingdom (0.46)
Asia > China (0.28)
Europe > Switzerland (0.28)

Genre: Research Report > Experimental Study (1.00)

Industry:

Information Technology (0.68)
Health & Medicine > Therapeutic Area > Neurology (0.46)

Technology:

Information Technology > Human Computer Interaction > Interfaces (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)
(4 more...)

Add feedback

Dexonomy: Synthesizing All Dexterous Grasp Types in a Grasp Taxonomy

Chen, Jiayi, Ke, Yubin, Peng, Lin, Wang, He

arXiv.org Artificial IntelligenceSep-4-2025

Generalizable dexterous grasping with suitable grasp types is a fundamental skill for intelligent robots. Developing such skills requires a large-scale and high-quality dataset that covers numerous grasp types (i.e., at least those categorized by the GRASP taxonomy), but collecting such data is extremely challenging. Existing automatic grasp synthesis methods are often limited to specific grasp types or object categories, hindering scalability. This work proposes an efficient pipeline capable of synthesizing contact-rich, penetration-free, and physically plausible grasps for any grasp type, object, and articulated hand. Starting from a single human-annotated template for each hand and grasp type, our pipeline tackles the complicated synthesis problem with two stages: optimize the object to fit the hand template first, and then locally refine the hand to fit the object in simulation. To validate the synthesized grasps, we introduce a contact-aware control strategy that allows the hand to apply the appropriate force at each contact point to the object. Those validated grasps can also be used as new grasp templates to facilitate future synthesis. Experiments show that our method significantly outperforms previous type-unaware grasp synthesis baselines in simulation. Using our algorithm, we construct a dataset containing 10.7k objects and 9.5M grasps, covering 31 grasp types in the GRASP taxonomy. Finally, we train a type-conditional generative model that successfully performs the desired grasp type from single-view object point clouds, achieving an 82.3% success rate in real-world experiments. Project page: https://pku-epic.github.io/Dexonomy.

artificial intelligence, grasp type, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2504.18829

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Robots > Manipulation (0.93)

Add feedback